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A new family of high-resolution multivariate 

spectral estimators 

Mattia Zorzi 

Abstract 

In this paper, we extend the Beta divergence family to multivariate power spectral densities. Similarly 
to the scalar case, we show that it smoothly connects the multivariate Kullback-Leibler divergence with 
the multivariate Itakura-Saito distance. We successively study a spectrum approximation problem, based 
on the Beta divergence family, which is related to a multivariate extension of the THREE spectral 
estimation technique. It is then possible to characterize a family of solutions to the problem. An upper 
bound on the complexity of these solutions will also be provided. Simulations suggest that the most 
suitable solution of this family depends on the specific features required from the estimation problem. 

Index Terms 

GeneraUzed covariance extension problem. Spectrum approximation problem, Structured covariance 
estimation problem. Beta divergence. Convex optimization 

I. Introduction 

The recent development of THREE-like approaches to mukivariate spectral estimation has 
triggered a renew interest for multivariate distance measures (or simply divergence indexes) 
among (power) spectral densities, [IJ. In the THREE approach, the output covariance of a 
bank of filters is used to extract information on the input spectral density. More precisely, 
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the family of spectral densities matching the output covariance matrix is considered and a 
spectrum approximation problem, which "chooses" an estimate of the input spectral density 
in this family, is then employed. The choice criterium is based on finding the spectral density 
which minimizes a divergence index with respect to an a priori spectral density. Note that, 
the problem of parameterizing the family of feasible spectral densities may be viewed as a 
generalized covariance extension problem [[21, [[3||, 01 (111, 0. The key feature for these 
estimators concerns the high resolution achievable in prescribed frequency bands, in particular 
with short data records. Significant applications to these methods can be found in H^o robust 
control [!8l,[l9l|, biomedical engineering [fTOl, and modeling and identification [fTT| . [|T2|. [|T3l . 

The most delicate issue for this theory deals with the choice of the divergence index. In fact, 
the corresponding solution to the spectrum approximation problem (that heavily depends on the 
divergence index) must be computable and possibly with bounded McMillan degree. Accordingly, 
it is important to have many different indexes available in such a way to choose the most 
appropriate index in relation to the specific application. The THREE estimator, introduced by 
Byrnes Georgiou and Lindquist in (Ml, has been extended to the multichannel case by suggesting 
different multivariate divergence indexes, (T5l , (T6ll , (TTl . In particular, Georgiou introduced a 
multivariate version of the Kullback-Leibler divergence, (TSl . which has been frequently used 
within information theory, and a multivariate extension of the Itakura-Saito distance has been 
recently presented by Ferrante et ah, (TTl . The latter metric has an interpretation in terms of 
relative entropy rate among processes. Finally, it is worth to note that the output covariance is 
not available in a THREE-like spectral estimation method. Indeed, we need to estimate it by 
using a collection of sample data generated by feeding the filters bank with the signal whose 
spectral density is to be estimated. Moreover, the family of spectral densities matching the 
estimated output covariance must be non-empty. This covariance estimation task is accomplished 
by solving a structured covariance estimation problem, (T8l , (T9ll . Therefore, a THREE-like 
spectral estimation procedure consists in solving a structured covariance estimation problem and 
then a spectrum approximation problem. 

The main results of this paper are three. Firstly, we extend to the multivariate case the Beta 
divergence family (introduced for the scalar case in (20l ) which smoothly connects the Kullback- 
Leibler divergence with the Itakura-Saito distance. It is worth mentioning that the Beta divergence 
family for scalar spectral densities has been widely used in many applications: Robust principal 
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component analysis and clustering ||2T]| . robust independent component analysis [|22l . and robust 
nonnegative matrix and tensor factorization [|23l . [|24l . 

Secondly, we consider a spectrum approximation problem which employs the multivariate 
Beta divergence family. It turns out that it is possible to characterize a family of solutions to the 
problem with bounded McMillan degree. Moreover, its limit coincides to the solution obtained 
by using the Kullback-Leibler divergence. 

Finally, we tackle the related structured covariance estimation problem which can be viewed 
as the static version of the previous spectrum approximation problem. Also in this case, a Beta 
matrix divergence family for covariance matrices, leading a family of solutions to the structured 
covariance estimation problem, may be introduced. 

The paper is outlined as follows. Section [ll] introduces THREE-like spectral estimation meth- 



ods. Section III presents the new extension to the multivariate case of the Beta divergence 



family. In Section IV the corresponding spectrum approximation problem is introduced. More 
precisely, we derive the solution thanks to the means of the convex optimization. In Section 
[V] a non trivial existence result for the dual problem is established. Then, in Section VI a 



matricial Newton algorithm to efficiently solve the dual problem is presented. In Section VII 



some comparative examples are given: We test the different features of the found solutions. 



Section VIII is devoted to the estimation of structured covariance matrices by using the Beta 
matrix divergence family. Finally, in Section |IX] we propose an application to the estimation of 
multivariate spectral densities which employs the resulting THREE-like estimator. 



II. THREE-LIKE SPECTRAL ESTIMATION 

Let us consider an unknown zero mean, m-dimensional, ]R™-valued, purely non-deterministic, 
stationary process y = {y^; k E Z} with spectral density ^l^e^"^) defined on the unit circle T. 
Assume that the a priori information on is given by a prior spectral density ^ G §™(T). Here, 
§!^(T) denotes the family of bounded and coercive M"^^™- valued spectral density functions on 
T. Then, a finite-length data yi . . .y^ generated by y is observed. We want to find an estimate 
$ G §™(T) of Q by exploiting ^ and yi . . .y^- This spectral estimation task is accomplished 
by employing a THREE-like approach which hinges on the following four elements: 

1) A prior spectral density ^ G 
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2) A rational filter to process the data 

G{z) = {zI-A)-^B, 



(1) 



where A E M"^" is a stability matrix, B G M"^™ is full rank with n > m, and {A, B) is 
a reachable pair; 

3) An estimate S, based on the data yi . . . i/n, of the steady state covariance S = > of 
the state of the filter 

Xk+i = Axk + Byk] (2) 

4) A divergence index S between two spectral densities. 

According to the THREE-like approach, an estimate $ G of Vt is given by solving the 



minimize 5($||\E') over the set 

$ G §™(T) I I G^G* = T.\ . 



problerrQ 



(3) 

Note that \E' is generally not consistent with S, i.e. / G^G* ^ E. Hence, we have a spectrum 
approximation problem. The parametrization of all spectral densities satisfying constraint in ([3]) 
may be viewed as a generalized moment problem. For instance, the covariance extension problem 
may be recovered by setting 

1 T 



G{z) 



^ J-m • • • ^ J-n 



(4) 



In this case, the state covariance has a block Toeplitz structure: 

So Si S2 . . . S„_i 
S-f So Si •. S„_2 

yT 

Z^2 .... 



(5) 



'Here and throughout the paper, integration, when not otherwise specified, is on the unit circle with respect to the normalized 
Lebesgue measure. Moreover, a star denotes transposition plus conjugation. 
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A. Feasibility of the problem 

The first issue arising with the previous spectrum approximation problem concerns its feasi- 
bility, i.e. the existence of $ G satisfying the constraint in (jsj) for a given S. To deal with 
this issue, we first introduce some notation: Q„ C M"^" denotes the n(n + l)/2-dimensional real 
vector space of n-dimensional symmetric matrices and Q„ + denotes the corresponding cone of 
positive definite matrices. We denote as V(§!J!) the linear space generated by §™(T). Finally, we 
introduce the linear operator 

r : v{^^)-^Qn 

$ ^ y G^G*. (6) 

In [|25l (see also [fT6l '). it was shown that a matrix P E Qn belongs to the range of F, denoted 
by Range F, if and only if there exists H E M*"^" such that 

P - APA^ = BH + H^B^. (7) 

An equivalent condition, [[T8l , is that the kernel of the linear operator 

V : Qn^Qn 

Q ^ niiQ - AQA^)Ui (8) 

contains P, namely V{P) = 0. Here, := I — B{B^ B)^^ . It turns out that the spectrum 
approximation problem is feasible if and only if S G Range F fl Q„,+, [l25l . [fT6ll . Let xi . . .xn 
be the output data generated by feeding the filters bank with the finite-length data yi . . .i/n- An 
estimate of S is therefore given by the sample covariance matrix f^c '■= ^ J2k=i ^kxj^ which 
is normally positive definite. It may not, however, belong to Range F. Accordingly, we need to 
compute a new estimate S G Range F which is positive definite and "close" to the estimate Sc-. 
Hence, we have to solve a structured covariance estimation problem which lead us to consider 
the following convex optimization task. 
Problem 1: Given > 0, 

minimize T'(P||Sc') over the set 

{P E Qn,+ I V{P) = 0} . (9) 
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Here, V is a suitable divergence index among (positive definite) covariance matrices. Problem [T] 
can be efficiently solved, [fTSl. by considering the information divergence among two Gaussian 
densities with covariance P and Q, respectively, [|26ll : 



Vi{P\\Q) := ^ tr[log(Q) - log(P) + PQ-' - I]. (10) 

Another approach characterizes S in terms of the filter parameters and the sequence of the 
covariance lags of y, [fT9l . Once we have S in such a way that the spectrum approximation 
problem is feasible, we can replace G with G = T.^^G and (A, B) with {A = T,~^AT.^,B = 
T.^^B). Thus, the constraint may be rewritten as J G^G* = I. Accordingly, from now on we 
assume that the spectrum approximation problem in ([3]) is feasible and we consider the following 
equivalent formulation. 

Problem 2: Given ^ G §™(T) and G{z) = {zl - A)-^B such that / G Range T, 

minimize 5(<I>||\E') over the set 

$ G I f G^G* = l\ . (11) 



B. Choice of the divergence index 

A divergence index among spectral densities in §™(T) must satisfy the following basic property 
for all -l*,^ G §';'(¥): 

S{^\\^) > 

5($||^) = if and only if $ = ^. (12) 

Moreover, the corresponding Problem |2] should lead to a computable solution, by typically solving 
the dual optimization problem. In [fTSl , a Kullback-Leibler divergence for multivariate spectral 
densities with the same trace of the zeroth-moment has been introduced 

5xLo($||^) = j tr[$(log($) - log(^))] (13) 

where log(-), whose definition will be given in Section III-B[ is the matrix logarithm. This 



divergence is inspired by the Umegaki-von Neumann's relative entropy [|27ll of statistical quantum 
mechanics. Moreover, ( [T3] ) may be readily extended to the general case, see [|28l for the scalar 
case, 

5kl($||^) = j tr[<l>(log(<l>) - log(^)) - $ + ^] (14) 
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and 5/4-2,0(^11^) = '5kl(^||^) when Jtr$ = J tr \E'. The resulting solution to the spectrum 
approximation problem is, however, not rational, even when ^ = J. On the contrary, by 
considering the multivariate extension of the Itakura-Saito distance 

Si^mm = j tr[log(^) - log(<D) + $$-1 - /], (15) 

a rational solution is obtained, [fTTll . We will show in the following Section that the divergence 



indexes (14) and (15) belongs to the same multivariate Beta divergence family. Moreover, this 
family leads to a family of solutions to the spectrum approximation problem. 

Observe that, it is also possible to rewrite Problem [2]by considering iSklI^II'^')- The resulting 
solution is, however, only computable when ?/ is a scalar process [fT4ll , [|29l , or \1' = /, [5], 
ll30ll . [[TSl . Finally, we mention that there exists another multivariate distance, called Hellinger 



distance, which gives a rational solution to Problem [2| [fT6l. 

III. Beta divergence family for spectral densities 

In this section we extend the notion of Beta divergence (family) for scalar spectral densities, 
firstly introduced in [20.1 and [|22l . to the multivariate case. 

A. Scalar case 

We recall the definition of the scalar Beta divergence by adopting the same notation employed 
in [|28l . First of all, we need to introduce the following function 




1 



(16) 



y , 

log(a;) - log(?/), c=l 
which is referred to as generalized logarithm discrepancy function throughout the paper. Notice 
that log^ is a continuous function of real variable c and logc(a;, y) = if and only if x = y. The 
(asymmetric) Beta divergence between two scalar spectral densities $, \E' G §J'|_(T) is defined by 

5^($||^) := -iy"[$'^logi (^^$^) +<l'^-^^] 

= /[^(<^'^-$^''"')-^(<f^-*^)] (IV) 
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where the parameter /3 is a real number. For /3 = and /3 = 1, it is defined by continuity in the 
following way 

lunS^m^) = 5kl($||^), (18) 



where Sis and Skl are the scalar versions of (15) and ([14]), respectively. Moreover, the Beta 



divergence is a continuous function of real variable /3 in the whole range including singularities. 
Thus, it smoothly connects the Itakura-Saito distance with the Kullback-Leibler divergence. Since 



Sp is a divergence index, property (12) is satisfied. Finally, Sp is always strictly convex in the 



first argument, but is often not in the second argument. 
B. Multivariate case 

Likewise to the scalar case, we start by introducing the generalized multivariate logarithm dis- 
crepancy. To this aim, recall that the exponentiation of a positive definite matrix X to an arbitrary 
real number c, is defined as := Udiag{dl, . . . , d'^)U^ where X := Udia,g{di, . . . , dm)U^ is 
the usual spectral decomposition with U orthogonal, i.e. UU^ = I, and diag((ii, . . . ,dm) > 
diagonal matrixj^We are now ready to extend the definition of generalized logarithm discrepancy 
to the multivariate case 

Qm,+ X Qm,+ ~^ Qm 

^(Xi-r-1-/), CGM\{1} ^^^^ 

iog(x)-iog(y), c=i 

where log(X) = f/diag(log((ii), . . . ,\og{dm))U^ is the matrix logarithm of X. 

Proposition 3.1: The generalized multivariate logarithm discrepancy is a continuous function 
of real variable c in the whole range. Moreover, log^{X,Y) = if and only if X = Y. 

Proof: By definition X^^^ and Y^^^ are continuous function of real variable c. Thus, the 
function log^(X, Y) of real variable c is continuous in M \ {1}. It remains to prove that log^ is 
continuous in c = 1. This is equivalent to show that limc_>i log^{X, Y) = log(X) — log(F). Let 
X = Udi&g{di, . . . , djn)U'^, then 

iX'-^ -I) = f/diag(^i ^ ^)f/^. (20) 




1 — c 1 — c 1 — c 



^It is also possible to take the exponentiation of positive semidefinite matrices when c 7^ 0. 
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Taking the limit for c — > 1, we get 



lim^—iX^-" - I) 

= C/diag(lim ^ -, . . . , lim ^ -)U'^ 

c-^l 1 — C c^>l 1 — C 

= [/diag(log(cii), . . .,log{dJ)U^ = log(X). (21) 



Accordingly, 



limlog,(X,F) 

= mn[^(x--/)-^(y--7)]y- 

= log(X) - log(F) (22) 

which proves that log^ is continuous in c = 1. Concerning the last statement, it is straightforward 
that X — Y implies log^{X,Y) — 0. On the contrary, \og^{X,Y) — 0, with c 7^ 1, implies 
Xi-cyc-i ^ 7 which is equivalent to X^'" = Y^'", since X,Y e Qm,+. Thus, X ^Y.We get 
the same conclusion for c = 1 by exploiting similar argumentations. ■ 

The exponentiation of a spectral density $(c^'') G §™(T) to an arbitrary real number c is 
punctually defined by exploiting the previous spectral decomposition: 

where $(0^'^) = [/(c^'^)diag(rfi(c-?'^), . . . , d„(c-'''))f/(cJ'^)^' with f/(c-?'^) G L™^™(T) such that 
U{i^^)U{(^^Y = Observe that ^'^ belongs to S![!(T). We are now ready to introduce the 
multivariate (asymmetric) Beta divergence among ^ e S!J!(T): 

5^($||^) j tr[^Hogi_{^!^,^^) -'^^] 

= j tr[^($^ - $*^-^) - -^($^ - (23) 

where ^ G M \ {0, 1}. Similarly to the scalar case, we can extend by continuity the definition 

of Beta divergence for /3 = and (3 = 1. 
Proposition 3.2: The following limits hold: 

lim5«($||*) = <Sis($||*) 

lim5^($||^) = 5kl($||^). (24) 
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Proof: Since $ and \1/ belong to S!j!(T), i.e. $ and \1/ are coercive and bounded, it is possible 



to show by standard argumentations that the integrand function of ( 23 1 uniformly converges on 
T for /3 — )• and (3^1. Hence, it is allowed to pass the limits, for /3 — i- and /3 -> 1, under 
the integral sign. Taking into account the first limit, we get 



lim [ ti\-^((!>^ - $^^-1) - - 

[ tT\~I + ^^-^ - lim 4 [(^'^ - /) - - /)1 
J { /3->o/3 

j tr[-/ + <l>^"^ - log($) + log(^)] 



= 5is($||^) 

where we exploited (21 ). For the second limit, we obtain 

= - j tr[<l> lim^logi [^^, + $ - ^] 

= - j tr[$ lini log2_/3 $) + $ - \[^] 

tr[$ (log($) - log(\[^)) + ^ - $] 
5kl($||^) 



where we exploited (22) 



(25) 



(26) 



In view of Proposition 3.1 and Proposition 3.2 we conclude that the multivariate Beta divergence 



is a continuous function of real variable (3 in the whole range including singularities and it 
smoothly connects the multivariate Itakura-Saito distance with the multivariate Kullback-Leibler 
divergence. 

Remark 3.1: For (3 = 2, the Beta divergence corresponds, up to a constant scalar factor, to 
the standard squared Euclidean distance (L2-norm) 



5l2($||^) = J ($-*,$-^) 



(27) 
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where (X, Y) = tr(XY) is the usual scalar product in Q^- 



Finally, we show that the multivariate Beta divergence satisfies condition (12). 
Proposition 3.3: Given $, \E' G the following facts hold: 

1) 5/3(-||\E') is strictly convex over §™(T), 

2) Sfs^^W"^) > and equality holds if and only if \[/ = $. 

Proof: In order to prove the statements, we need of the following first variations of the 
maps X tr(X'^) and X i— )■ tT{X^Y), respectively (further details may be found in Appendix 

5{tT[X''];SX) = ctT[X"^6X] 

<5(tr(X^F); 6X) = tr[0x,c(5X)y], (28) 



where Y G Qm and the map Ox,c is defined in (78). 



1) The first variation of 5^(<l>||\&), with respect to $, in direction 5$ G L™^™(T) is 

S{Spm<ify,S^) = —J^ tr[($^-i-M/^-i)5$]-. (29) 
The second variation in direction 5$ is 

S\Spmn,S<^) = / tr[0$,;3_i(5$)<5$] — 

Jo ^TT 

/»27r r pi poo 

Jo Uo Jo 

X ($ + t/)-Mt$('^-^)"dr5$l — 

/»27r /»1 /»oo 

= / / / tr[<l>(^-i)(i-^)(<l> + t/)-i5<l> 
Jo Jo Jo 

X ($ + t/)-i<l>('^-i)"(5$]dtdr — . 
By the cyclic property of the trace and since and ($ + tl)^^ commute, we get 

,2,- ... r . . 

where 



JO JO 



5\S(smnS^)= I I I A.($,5<f)dMr— (30) 



ft,r{X, A) = tr[X^^(X + t/)-^A(X + t/)-5 



^^(/3-l)(l-r)^^ + t/)-5A(X + (31) 
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with X e Qm,+, A e Q^,t e [0, oo) and r e [0, 1]. Thus, ft,r{X, A) > and /j,^(X, A) = 



if and only if A = 0. We conclude that integral (30), i.e. the second variation of 5/3(-||\l'), is 

positive for 5$ ^ 0. Accordingly, iS^(-||\E') is strictly convex over the convex set §™(T). 

2) As a consequence of the previous statement, the minimum point is unique and it is given by 



annihilating for each 5$ G L™^ Since ^^^-^ _ ^/^-i g L^^™(T), it follows that the 
minimum point satisfies the condition = \['^~^. Accordingly, $ = vj/. Finally it is sufficient 
to observe that =0. ■ 

Note that is not convex on §™(T) (not even in the scalar case). 

IV. Spectrum approximation problem 

Since the Beta divergence is well-defined for /3 G M, we choose /3 = — ^ + 1 with u G 
Z \ {0}. As we will see, this choice guarantees that the corresponding solution to Problem 
|2] (which is feasible) is rational for a suitable choice of ^. In order to simplify the notation 
we define 5,,($||\E') := >S;3($||\&) with (3 = + 1. We have to minimize >S,,(<I>||\&) over 
1$ G §™(T) I J G^G* = /}. Since it is a constrained convex optimization problem, we consider 
the corresponding Lagrange functional 



L,($,A) 

= 5,($||^) + j tr(^^) + - /, A 



tr 



$^ + + G*AG^ 



tr(A) (32) 



where we exploited the fact that the term J tr(\l''^--^) plays no role in the optimization problem. 
Note that, the Lagrange multiplier A G Q„ can be uniquely decomposed as A = Ap + A^ where 
Ar G Range T, A^ G [Range T]^. Since A^ is such that G*{e^'^)A±G{e^'^) = and tr(Ax) = 
{A±, I) = (see [31, Section III]), it does not affect the Lagrangian, i.e. L^(^, A) = Ar). 
Accordingly we can impose from now on that A G Range T. 

Consider now the unconstrained minimization problem min A) | $ G S"^(T)}. Since 

A) is strictly convex over §™(T), its unique minimum point is given by annihilating its 
first variation in each direction 5$ G L™^'"(T): 



5L^(<I>,A;5$) = y tr (^z/(^-^ -<l>-^) + G'*AG') 5$ 



(33) 
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where we exploited Note that, z/(^-^ - <l>-^) + G*AG E l;^^™(T). Thus, Q is zero 
V$ e L'J^^™(T) if and only if 

= ^-^ ^ -G*AG. (34) 
Since G §![!(¥), the set of the admissible Lagrange multipliers is 

:= |a e Qn I + ^G*AG > on t| . (35) 
Therefore, the natural set for A is 

Cl = C^n Range T. (36) 
In conclusion, the unique minimum point of the Lagrange functional has the form 

$^(A) := [^-^ + ^G*AG]-''. (37) 

Assuming that is rational in e-^'', there always exists a unique (up to a right-multiplication by 
a constant orthogonal matrix) stable and minimum phase spectral factor W such that \Ef(e^'^)^ = 
W{e^'^)W{e^^)*. By defining Gi(e-''^) = -^G{e^^)W{e^'^), we obtain an equivalent form of (|37|): 



-lTI/*lt/ 



<I>^(A) = [W{I + GlAGi)-'W 



(38) 



Corollary 4.1: Let ^ be a constant prior. Then, is rational in e-''^ with McMillan degree 
less than or equal to 2n\u\. Moreover, among all the spectral densities with u E Z \ {0}: 

1) The spectral densities with the smallest upper bound on the McMillan degree correspond 
to the Itakura-Saito and the squared Euclidean distance 

2) As z/ — )■ ±oo, $y tends to the spectral density corresponding to the Kullback-Leibler 
divergence. 



Proof: In view of (37) and (38), deg[$i^] < |z/|(deg[\l'^] + 2n) = 2n\v\ where n is the 
McMillan degree of G{z). 

1) Since z/ G Z \ {0}, the spectral densities with the smallest upper bound on the McMillan 
degree are attained for v = ±1, i.e. (3 = and (3 = 2, which are the optimal forms related to 
>Sis($||^) and 5l2($||^I/), respectively. Note that, $i(A) = [^"^ + G'*AG']"\ which is the same 
optimal form found in [fTTl for the multivariate Itakura-Saito distance, and $_i(A) = \1' — G'*AG'. 

2) Firstly, it is possible to show that the optimal form obtained by using the Kullback-Leibler 
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divergence is $kl(A) = e'°s'-*^ g*ag y^,]^ich is a straightforward generalization of the optimal 
form for Sklo presented in [fTSl. We want to show that — )■ $kl as z/ — )■ ±00. Let us consider 
the function F(A) := log(^-^ + AG* AG) with A G M such that + AG* AG > on T. Its 
first order Taylor expansion with respect to A = is + AG* AG — /. Accordingly, 

lim z/log(^-^ + -G*AG) 

^-l - 1 

= lim — + G*AG 

= - log(^) + G*AG (39) 



where we exploited (21) and the previous Taylor expansion. Finally, 

lim $,(A) = lim e'°s[(*-* + ^G*AG)-i 



fc'— >±0O I'— >itoo 



lim Q-'^iosi^'^ + lG'AG) 
V— >itoo 



e 



lim ulog{^-h + ^G*AG) 



= e'°s(*)-G*'^G = $kl(A). (40) 

■ 

In this section we showed that $i/(A) is the unique minimum point of L^(-,A), namely 

L,($,(A), A) < L,{^, A), V$ G §™(T) s.t. $ <I>,(A). (41) 



Hence, if we produce A° G Cl satisfying constraint in (11), inequality (41) implies 



5,($,(A°)||^) <5,($||^), V$ G §™(T) s.t. $ 7^ <I>,(A°) (42) 

namely such a $j,(A°) is the unique solution to Problem [2] with S„. The following step consists 
in showing the existence of such a A° by exploiting the duality theory. 

V. Dual problem 

Here, we do not deal with the case u = 1, since the existence of the solution to the dual 
problem was already showed in [17]. We start by considering the case u G N+ \ {1}F] The dual 



denotes the set of the positive natural numbers. 
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problem consists in maximizing the functional 

infL,($,A) = L,(<I>,,A) 

= J tr[(^-^ + ^G*AGY-''] - tr(A) (43) 

which is equivalent to minimize the following functional hereafter referred to as dual functional: 

MA) = j tr[(^-^ + ^G*AGY-''] + tr(A). (44) 

Theorem 5.1: The dual functional Ji, belongs to C°°{Cl) and it is strictly convex over C^. 



Proof: In view of (28), the first variation of J„{A) in direction 6Ai G Q„ is 



6 MA; 6 A 



= - J ti[{^-^ + ^G*AG)-''G*5AiG]+ti{5Ai) 

= - J ti [{Wil + GlAGi)-^W*y G*5A^G] +tr(5Ai). (45) 

The linear form VJu,a{-) '■= SJu{A; ■) is the gradient of at A. In order to prove that Ju{A) E 
C^{Cl) we have to show that 5{J^{A);SAi), for any fixed SAi, is continuous in A. To this 
aim, consider a sequence M„ G Range T such that M„ — > and define Qn{z) = W{z){I + 
Gi{z)*NGi{z)y'^W{z)* with G Q„. By Lemma 5.2 in ^ and since W is bounded on 
T, Qa+Mu converges uniformly to Q\. Thus, applying elementwise the bounded convergence 
theorem, we obtain 

Jim J GQl^,,G* = j GQIG*. (46) 

Accordingly, 5{Jy{A)]5A) is continuous, i.e. Ju belongs to C^{C^). In order to compute the 
second variation, consider the operator X : A^" . By applying the chain rule, we get 

V 

5(X(A); M) = - ^ A-^bAA^-""-^. (47) 

Thus, for 5Ai,5A2 G Q„ we have 

5V,(A;5Ai,5A2) 

1 r 

tT[QiG*6A2GQl+'-'G*6A,G]. (48) 

^ 1=1 
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The bilinear form l-Lu,h{-i ■) = 5^ Ji,(A; •, ■) is the Hessian of Jj^ at A. The continuity of can 
be established by using the previous argumentation. In similar way, we can show that has 
continuous directional derivatives of any order, i.e. E C''{Cl) for any k. Finally, it remains 
to be shown that is strictly convex on the open set C^. Since J„ E C°°{Cl), it is sufficient 
to show that 'Ha(5A, 5 A) > for each 6 A E Range T and equality holds if and only if 5 A = 0. 



Since u > and the integrands in (48) are positive semidefinite when SAi = 6A2, we have 
nA{6A,6A) > 0. If nA{6A,6A) = 0, then G*6AG = namely 6 A E [Range T]^ (see Ml 
Section III]). Since 5A E Range F, it follows that 5A = 0. In conclusion, the Hessian is positive 
definite and the dual functional is strictly convex on C^. ■ 



In view of Theorem 



5.1 the dual problem rnin {J^iA) \ A E C^,} admits at most one solution 
A°. Since is an open set, such a A° (if it does exist) annihilates the first directional derivative 



( |45] ) for each 6 A E Q„ 
I 



[^(^-^ + ^G*A°G')-'^G'*,5A ) = V5A G 



or, equivalently. 



/ = y + ^G*A°G)-''G* = J G<^^{A°)G*. 



(49) 



(50) 



This means that $j,(A°) E §™(T) satisfies the constraint in (11) and $i,(A°) is therefore the 
unique solution to Problem [2j 

The next step concerns the existence issue for the dual problem. Although the existence 
question is quite delicate, since set C^, is open and unbounded, we will show that a A° minimizing 

over Cl does exist. 

Theorem 5.2: Let u E N+ \ {1}, then the dual functional .7,^ has a unique minimum point in 

Proof: Since the solution of the dual problem (if it does exist) is unique, we only need to 
show that Jy takes a minimum value on C^. First of all, note that Jy is continuous on , see 



Theorem 5.1 Secondly, we show that tr[A] is bounded from below on C^. Since Problem [2] is 
feasible, there exists $7 E W^{T) such that / G^/G* = /. Thus, 



tr[A] = tr 



G^iG*A 



tr 



G*AG^i 



(51) 
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Defining a = — z/tr J \E' we obtain 



tr[A] = z/tr 



+ a. 



(52) 



Since + ^G'*AG is positive definite on T for A G C^, there exists a right spectral factor A 
such that + ^G*AG = A*A. Moreover, $/ is a coercive spectrum, namely there exists a 
constant fi > such that $/(e-''^) > /iJ, V e-''^ G T. Starting from the fact that the trace and the 
integral are monotonic functions, we get 



tr[A] 



v tr 



A<l>jA* 



+ a > z//itr 
1 



AA* 



+ a 



-G*AG 



+ a > a 



(53) 



where we have used the fact that ti J + ^G*AG > when A E C^,. Finally, notice that 
JuiO) = —jzi, J tr(^^). Accordingly, we can restrict the search of a minimum point to 
the set |A G I JuiA) < Ji^(O)}. We now show that this set is compact. Accordingly, the 
existence of the solution to the dual problem follows from the Weierstrass' Theorem. To prove 
the compactness of the set, it is sufficient to show that: 
1) lim JJA) = +oo; 



2) lim JJA) 

||A|Hc» 



-oo. 



A[Z)'-'' is 



1) Firstly, Rjy{z) := ^-^{z) + ^G{z)*AG{z) is a rational matrix function, thus R 
rational as well. Observe that dC^, is the set of A G Range T such that R^ie^^) > on T and 
there exists i!) such that RAie^^) is singular. Thus, for A — dC^ all the eigenvalues of RAiz)^^ are 
positive on T and at least one of them has a pole tending to the unit circle. Since l — u < —1, then 
also Ra^zY^'^ has at least one eigenvalue with a pole tending to T. Accordingly, tr[/ R^^'^] — )■ oo 



as A del- In view of (53), we conclude that Ju{A) = -j^tr [/ R^'"] + tr[A] cx) as 
A -> del. 

2) Consider a sequence {A,t}fcgN G such that 



lim llAi, I 



oo. 



fc— >CXD 



(54) 



Let AO 



IIAfcll 



. Since Cl is convex and G if A G Cl then ^A e Cl V ^ G [0, 1]. Therefore 



A° G Cl for k sufficiently large. Let i] := liminf tr[A°]. In view of (53), 

1 



tr[A°] 



\Au 



ti[Ak] > 



1 

[aT 



ra 0, 



(55) 
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for A; — i- oo, so ?7 > 0. Thus, there exists a subsequence of {A°} such that the limit of its 
trace is equal to 77. Moreover, this subsequence remains on the surface of the unit ball OB = 
{a = I ||A|| = 1} which is compact. Accordingly, it has a subsequence {A^^} converging in 
dB. Let A°° G dB be its limit, thus lim tr[A^ ] = tr[A°°] = 7]. We now prove that A°° G C^. 
First of all, note that A°° is the limit of a sequence in the finite dimensional linear space Range F, 
hence A°° G Range F. It remains to be shown that + ^G*A°°G is positive definite on T. 
Consider the unnormalized sequence {AfcJ G Cl: We have that + ^G*Ak^G > on T 
so that 11^"^ II + l^*^k,^ ^1^0 positive definite on T for each i. Taking the limit for 

II fc^ II * 

2 ^ 00, we get that G*A°°G is positive semidefinite on T so that \[^"- + lG*A°°G > on T. 
Hence, A°° G Since Problem g is feasible, there exists G §™(T) such that / = / 
accordingly 

= tr[A°°] =ti j G^iG*A°° = tr J ^JG^A^G^I. (56) 

Moreover, G*A'^G is not identically equal to zero. In fact, if G'*A°°G = 0, then A°° G 
[Range F]-*- and A°^ 7^ since it belongs to the surface of the unit ball. This is a contradiction 
because A°° G Range F. Thus, G*A°^G is not identically zero and r] > 0. Finally, we have 



lim Ju{Ak) 



A;— >oo 

r r ,1 



- lim tr 

fc— >CXD 1 1/ 



J (^qj-l + ^G*AkGy 



+ tr[AA 



> lim ||Afe|| tr[A^] = r] lim ||Afe|| = 00. (57) 

k—^OD k—^oo 



It remains to deal with the case z/ G Z such that u < 0. In this situation, the dual problem may 
not have solution: The minimum point for Jjy(A) may lie on dC^, since J,, takes finite values 
on the boundary of C^,. 

VI. Computation of A° 

We showed that the dual problem always admits a unique solution A° on for 1/ G N+. In 
order to find A°, we exploit the following matricial Newton algorithm with backtracking stage 
proposed in [1 3T| : 

1) SetAo = /G£|;; 
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2) At each iteration, compute the Newton step Aa- by solving the linear equation "Hj^^a^ ( Aa^ , ■) = 
— VJ;y,Ai(-) where, once fixed Aj, VJi^,Ai(-) and 'Hu,Ki{--, ■) must be understood as a linear 



and bilinear form of (45) and (48), respectively; 

3) Set t° = 1 and let t\^^ = t\/2 until both of the following conditions hold: 

K + t'i/^^^eCl (58) 
UK + AaJ < J.(A,) + at\ (VJ.,A„ Aa,) (59) 

with < a < 1/2; 

4) Set A,+i = A, + tfAA,; 

5) Repeat steps 2, 3 and 4 until || VJjy,Ai(-) II < ^ where £ is a tolerance threshold. Then set 
A° = A,. 

The computation of the search direction Aa^ is the most delicate part of the procedure. The 
corresponding linear equation reduces to 

IY.! GQ'^^G*Aa^GQI+'-'G* = I GQIG* - I (60) 
^ 1=1 ^ ^ 

where Qa = W{I + GlAGiy^W*. By similar argumentations used in [TF, Proposition 8.1], it 
is possible to prove that there exists a unique solution Aa, G Range T to ( [601 ). Accordingly, we 
can easily compute Aa, in this way: 

1) Compute 

Y = J GQip*-I- (61) 

2) Compute a basis {Si ... Sa/} for Range F from (jv]) and for each S^, = 1 . . . M, compute 

Yk = -J2 f GQ\fl*i:uGQlf-'G*- (62) 
^ 1=1 

3) Find {a^} such that Y = Y^k^^k^k- Then set Aa, = Y^k^k^k- 

Concerning the evaluation of the integrals in (59), ( [6T] ) and (62), a sensible and efficient method 
based on spectral factorization techniques may be employed. For further details, including the 



checking of condition (58), we refer to Section VI in BTI . 
Finally, it is possible to prove that: 

1) Jy{-) e C°°iCl) is strongly convex on the sublevel set IC = {A e Cl \ J^{K) < J^{Aq)]\ 

2) The Hessian is Lipschitz continuous in /C. 
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The proof follows the ones in [[3T1 Section VII] and [fTTl Section VI-C] faithfully. These properties 
allow us to conclude that the proposed Newton algorithm globally converges, [32, Chapter 9]. 
In particular the rate of convergence is quadratic during the last stage. In this way, the solution 
to Problem [2] may be efficiently computed. 

VII. Simulations results - Part I 

In order to test the features of the family of solutions with u G N+, we take into account 
the following comparison procedure: 

1) Choose a zero mean stationary process y = [yk] k E Z} with spectral density G §" (T); 

2) Design a filters bank G{z) as in ([T]); 

3) Set \E' = J i7 (\E' is constant and equal to the zeroth moment of 

4) Set S = S G Range F fl Qn,+, i-e- the corresponding spectrum approximation problem is 
feasible; 

5) Solve Problem [2] (with S^) by means of the proposed algorithm with the chosen \E' and 
T.^^G{z) as filters bank. 

In the above comparison procedure we assume to know S and J In this way, we avoid the 
approximation errors introduced by the estimation of E and / ^7 from the finite-length data 
yi . . .y^. Concerning the design of the filter, its role consists in providing the interpolation 
conditions for the solution to the spectrum approximation problem. More specifically, a higher 
resolution can be attained by selecting poles in the proximity of the unit circle, with arguments 
in the range of frequency of interest, [[T4ll . 

A. Scalar case 

We start by taking into account Example described in [|3T1 Section VIII-B] (the unique 
difference is that we assume to know S and / Q,). Consider the following ARMA process: 

y(t) = 0.5y{t-l)-0A2y{t-2) + 0m2y{t-3) 

- 0.0425i/(t-4) + 0.1192y(t-5) 

+ e(t) + l.le(t-l) + 0.08e(t-2)-0.15e(t-3) (63) 

where e is a zero-mean Gaussian white noise with unit variance. In Figure [T] the spectral density 
Vl G §J'^(T) of the ARMA process is depicted (gray line). G{z) is structured according to the 
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covariance extension setting (|4]) with 6 covariance lags (i.e. n = 6). In Figure [T] the different 
solutions obtained by fixing 1^ = 1, dashed line, v = 2, solid line, and u = 3, thick line, are 
shown. The solution obtained by minimizing the multivariate Itakura-Saito distance {v = 1) is 
characterized by peaks which are taller than these in In fact, this solution seems the most 
adequate for detecting spectral lines, see example of Section VII-A in [fTTl . On the contrary, the 
peaks are reduced by increasing u. Note that, the solutions with u = 2 and z/ = 3 are closer to 
f2 than the one with u = 1. 

As second example we consider the scalar bandpass random process with spectral density Vt 
depicted in Figure [2] (gray curve). The cm toj^ frequencies are = 0.89 and "§2 = 2.46. Moreover, 
Vtie^"^) > 2- 10~^ in the stopband, accordingly G §^(T). Matrix B is a column of ones. Matrix 
A is chosen as a block-diagonal matrix with one eigenvalue equal to zero and eight eigenvalues 
equispaced on the circle of radius 0.8 

±0.8, 0.86^^5, 0.8e^^'5, 0.8e^^'3''. (64) 

Here, \1' = / ~ 1.5284. Figure |2] also shows the obtained solutions. The one with u = 1 turns 
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Fig. 2. Approximation of tiie spectral density of a scalar bandpass random process. 



out inadequate. The solutions with u = 2 and z/ = 3 are, instead, similar and closer to Vt. 
B. Multivariate case 

We consider a bivariate bandpass random process with spectral density plotted in Figure [3] 
(gray curve). Here, the cwtoj^ frequencies are i^i = 0.42 and = 1-94, and ^7(e■''') > 2 ■ 10^'^/ 
in the whole range of frequencies. The constant prior is 

, 0.9313 0.3314 , 
^ = / ~ . (65) 



0.3314 0.5128 

The matrix A of the filters bank has one eigenvalue equal to zero, two eigenvalues in ±0.8 
and three pairs of complex eigenvalues closer to the passband 0.8e^-'°'^, 0.8e^-'^'^, 0.8e^-'^. The 
solutions for u = 1 (dashed line) u = 2 (solid line) and u = 3 (thick line) are shown in Figure 
|3j It is apparent that the solution for u = 2 is the most appropriate. 

VIIL Structured covariance estimation problem 



As mentioned in Section II-A we only have a prior \E' and a finite-length data yi . . .i/n in 



the THREE-like spectral estimation procedure. Moreover, represents a family of estimates of 
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Fig. 3. Approximation of tiie spectral density of a bivariate bandpass random process. 



r2 and we showed how to compute it starting from ^ and S G Range T fl Qn,+- Accordingly, it 
remains to find S from yi . . . Un- To deal with this issue, we consider Problem [l] which can be 
viewed as the static version of Problem |2j Indeed, in both problems minimization of a divergence 
index, with respect to the first argument, is performed on the intersection among a vector space 
and an open cone. In this section, we briefly show that it is also possible to find a family of 
solutions to the structured covariance estimation problem. 

The Beta matrix divergence (family) among two covariance matrices P,Qe Qn,+ with (3 E 
M\ {0, 1} is defined as 

i^^iPWQ) ■■= - PQ''") - \^P' - Q')\- (66) 

In fact, Vp{P\\Q) is the Beta divergence iS^($||\l/) among the two constant spectral densities 
$(e-''^) = P and ^'(e-''') = Q. Since Vp is a special case of Sfi, it is strictly convex with respect 
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to the first argument. Moreover, it is a continuous function of real variable /3 G M with 

/3-5>0 

limV0{P\\Q) = VkUPHQ) (67) 



where = 2Vi (see (10)) is the Burg matrix divergence, and 



Vkl{P\\Q) ■■= tr[P(log(P) - log(Q)) - P + g] (68) 

is the extension of the Umegaki-von Neumann'^ relative entropy, [27], to non equal-trace matrices. 

Take into account Problem [l] with V^{P\\tc) := Vp{P\\tc) such that /3 = + 1 and 
u G N+. In [18J, the existence and uniqueness of the solution to the problem for u = \ has 
been showed. Moreover, the form of the optimal solution is Pb(A) = [S^^ + F*(A)]^^, where 
V*(A) := n;^An^ - A^Yi^/\Ii^A is the adjoint operator of the linear map V defined in ^ 
and A G Qn is the Lagrange multiplier. Consider now Problem [T] with u G N+ \ {1}. The 
corresponding Lagrange functional is 

L,(P,A) := I?,(P||Sc) + ^tr(Sc^) + (K(P),A) 

= I?,(P||Sc) + ^tr(s5^) + (P,y*(A)). (69) 

Since L^(P, A + A) = L,,(P, A) VA G keiiV*), we can assume that A G [ker(V^*)]-^. Moreover, 
Li,(-, A) is strictly convex over Qn,+- Thus, the unique minimum point of Li,(-,A), which is 
given by annihilating the first directional derivative of A), is 

P,(A) := [tl> + U^\A)]'r (70) 

Since Pu{^) G Qn.+, the set of the admissible Lagrange multipliers is 

:= |a G Q„ I tc"" + lv*{A) > o| n [ker(V*)]^ (71) 

which is an open and bounded set (the proof is similar to the one of Proposition 5.1 in IfTSlI ). 
Then, the dual problem is 



where 



A° = argmin J^(A) (72) 



J,(A) := -infL,(P,A) 

= -^tT[±-' + lv%A)]'-\ (73) 
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Note that Ju{0) = -^j^i ^"^{^c )■ Accordingly, we can restrict the search of a minimum point to 
the set £* := |A G | Ju{^) < Jp{^)} C C\, which is bounded. Following the same lines in 
ifTSll . it is possible to prove that Jj, E C°°{C^) is strictly convex on C„ and lim J„{A) = +00 
(the limit diverges because the exponent in ( [73] ) is negative). Thus, C* is a compact set (i.e. 
closed and bounded) and admits a minimum point A° over C* by the Weierstrass' Theorem. 
The uniqueness of A° follows from the fact that Jj, is strictly convex over C^. Also in this case, 
a globally convergent matricial Newton algorithm for finding A° may be employed. Therefore, 
once we computed A° the solution to Problem [T] is given by P^(A°). Finally, the same analysis 
may be extended to Vkl- In this case, Pkl{^) = e'°g(^c)-v*{A) 

To sum up, a family of solutions P^, E Range T fl Qn,+ to the structured covariance estimation 
problem has been found. In this way, we have a complete tool to compute the family of estimates 
of ri starting from a prior ^ and a finite-length data yi . . .y^'- We compute Pj, from yi . . .y^ 
and we then find <l>,^ starting from P^, and ^. 



IX. Simulation results - Part II 



We consider the bivariate bandpass random process y of Section VII-B and we take into 
account the following THREE-like spectral estimation procedure: 

1) We start from a finite sequence yi . . .y^ extracted from a realization of the process y; 



2) Fix G{z) as in Section [VIFB , 

3) ^^i^ = ^Y.tiykyh 

4) Feed the filters bank with the data sequence yi . . .yN, collect the output data xi . . .xm 
and compute Sc = ^ Ylk=i ^k^l; 

5) Compute P^ E Range T fl Qn,+ by solving Problem [T| (with V^), then set = P^ 

6) Compute $1, by solving Problem [2] (with S^) by means of the proposed algorithm with 
the chosen \l/ and 'E'^G{z) as filters bank. 

Note that, the prior ^ represents a coarse, low order, estimate of n obtained by the means of 
a standard and simple estimation method. Accordingly, is a spectral density (with bounded 
McMillan degree) which is consistent with the interpolation constraint in ([3]) and is as close as 
possible to the initial estimate ^ according to the divergence index S^. In the above procedure, ^ 
has been chosen as the constant spectral density equal to the variance of the given data sequence. 
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In Figure |4| the obtained estimates with = 50 (i.e. we have considered a short-length data) 
are depicted. Also in this case, the peaks of the estimates are reduced by increasing u. For the 
extracted sequence, the estimators for z/ = 2 and = 3 appear to perform better than the one for 
V = \. Finally, the same procedure can be applied to the other processes considered in Section 




Fig. 4. Estimation of the spectral density of a bivariate bandpass random process. 



m 

X. Conclusions 

A multivariate Beta divergence family connecting the Itakura-Saito distance with the Kullback- 
Leibler divergence has been introduced. The corresponding solutions to the spectrum approxima- 
tion problem are rational when a suitable parametrization of the parameter (3 is employed. The 
solution with the smallest upper bound on the McMillan degree is associated to the Itakura-Saito 
distance. Moreover, the limit of this family tends to the solution corresponding to the Kullback- 
Leibler divergence. Then, similar results may be found for the structured covariance estimation 
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problem. Simulations suggest that the presented family of estimators provides a relevant tool in 
multivariate spectral estimation. 
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Appendix 

A. On the exponentiation of positive definite matrices 

We collect some technical result concerning the exponentiation of positive definite matrices 
to an arbitrary real number. We start by introducing the differential of the matrix exponential 
and the matrix logarithm (see lITSll '). 

Proposition A.l: Given Y E Qn, the differential of F i— )► in the direction A G Q„ is given 
by the linear map 

My : A / e(^-")^Ae"^dr. (74) 



Proposition A.l: Given Y E Qn,+, the differential of Y log (Y) in the direction A E Qr 
is given by the linear map 



POO 

Ny:A^ (Y + tI)-^A{Y + tl)-^dt. 
Jo 



(75) 



Let us consider now a positive definite matrix X E Qn,+ and a real number c. The exponentiation 
of X to c may be rewritten in the following way 



= e'^°^^. (76) 

Accordingly, by applying the chain rule, the differential of X i— )• X^ in the direction A G Q„ is 
given by 



M,iosx{cNx{A)) = c 



P 1 POO 

/ / {X + tI)-^A{X + tI)-^dtX'^dT. (77) 

Jo Jo 



We summarize this result below. 

Proposition A. 3: The differential of X i— )■ X^ in direction A G Qn is given by the linear map 



pi POO 

Ox,c : A ^ c / X'^^^-") / (X + t/)-^A(X + tiy^dtX'^dr. 
Jo Jo 



(78) 
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Corollary A.1: The first variation of X ^ tr(X'^) in direction A e Q„ is 

5{ti{X^) ; A) = c tr (X'^-^ A) . (79) 
Proof: Since and {X + tJ) commute, we get 
5{tT{Xj,A) = tr(Ox,,(A)) 

= ctri^X'J^ {X + tI)-MtA^ 
= ctr{X^X-^A} ^ctrjX'^-^A}. 
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